16 research outputs found

    BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

    The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

    SugarDrawer: A Web-Based Database Search Tool with Editing Glycan Structures

    In life science fields, database integration is progressing and contributing to collaboration between different research fields, including the glycosciences. The integration of glycan databases has greatly advanced collaboration worldwide with the development of the international glycan structure repository, GlyTouCan. This trend has increased the need for a tool with which researchers in various fields can easily search for glycan structures in integrated databases. We have developed a web-based glycan structure search tool, SugarDrawer, which supports the depiction of glycans including ambiguity, such as glycan fragments that contain underdetermined linkages, and a database search for glycans drawn on the canvas. This tool provides an easy editing feature for various glycan structures in just a few steps, using template structures and pop-up windows that allow users to select specific information for each structure element. This tool has a unique feature for selecting possible attachment sites, which are defined in the Symbol Nomenclature for Glycans (SNFG). In addition, this tool can input and output glycans in WURCS and GlycoCT formats, which are the most commonly used text formats for glycan structures.

    The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application

    Recent years have seen great advances in the development of glycoproteomics protocols and methods, resulting in a sustained increase in the reporting of proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of a digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation site, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).
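
The requirements listed in this abstract can be pictured as a small data model. The sketch below is illustrative only: the class and field names (`GlycanRef`, `SiteGlycoform`, `relative_abundance`) and the protein identifier are hypothetical placeholders and are not taken from the GlycoCoO ontology itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GlycanRef:
    """A glycan given either as a full structure or only as a composition."""
    structure: Optional[str] = None     # sequence string, if fully determined
    composition: Optional[dict] = None  # residue counts, e.g. {"Hex": 5, "HexNAc": 2}

    def __post_init__(self):
        # Exactly one of the two representations must be supplied.
        if (self.structure is None) == (self.composition is None):
            raise ValueError("provide exactly one of structure or composition")


@dataclass
class SiteGlycoform:
    """All glycans reported for one glycosylation site of a protein."""
    protein_id: str
    site: Optional[int] = None  # None when no site mapping was performed
    glycans: list = field(default_factory=list)  # (GlycanRef, abundance or None) pairs

    def add(self, glycan, relative_abundance=None):
        self.glycans.append((glycan, relative_abundance))

    def total_abundance(self):
        """Sum of the abundances that were actually reported."""
        return sum(a for _, a in self.glycans if a is not None)


# Usage with placeholder values: a glycoform with an unmapped site.
form = SiteGlycoform(protein_id="PLACEHOLDER_PROTEIN")
form.add(GlycanRef(composition={"Hex": 5, "HexNAc": 2}), 0.75)
form.add(GlycanRef(structure="placeholder-sequence"), 0.25)
```

Keeping `site` and the abundance optional mirrors the point that site mapping and quantification are not always available, while the check in `GlycanRef` keeps structure and composition from being mixed up in one record.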

    GlycoRDF: an ontology to standardize glycomics data in RDF

    Motivation: Over the last decades several glycomics-based bioinformatics resources and databases have been created and released to the public. Unfortunately, there is no common standard in the representation of the stored information or a common machine-readable interface allowing bioinformatics groups to easily extract and cross-reference the stored information. Results: An international group of bioinformatics experts in the field of glycomics have worked together to create a standard Resource Description Framework (RDF) representation for glycomics data, focused on glycan sequences and related biological source, publications and experimental data. This RDF standard is defined by the GlycoRDF ontology and will be used by database providers to generate common machine-readable exports of the data stored in their databases. Availability and implementation: The ontology, supporting documentation and source code used by database providers to generate standardized RDF are available online (http://www.glycoinfo.org/GlycoRDF/). Supplementary information: Supplementary data are available at Bioinformatics online.

    Latest developments in Semantic Web technologies applied to the glycosciences

    The Integrated Life Science Database Project of Japan funded a group of glycoscientists to carry out a project to integrate glycoscience databases using Semantic Web technologies. As a continuation of the previous project period, the Japan Consortium for Glycobiology and Glycotechnology Database (JCGGDB) developed several glycoscience-related databases. The GlycoProtDB database is among those being integrated, providing an important resource to understand protein glycosylation. Another database being integrated is GlycoEpitope, a comprehensive database of carbohydrate epitopes and antibodies. In the current project period, we started the development of GlyTouCan, the international glycan structure repository providing unique accession numbers to all glycan structures. Although such databases are sufficiently important in and of themselves, their integration with other -omics data such as the protein information in UniProt will be crucial to bring glycosciences to the forefront of life sciences. However, to integrate such disparate sets of data among different fields in a way such that future maintenance costs are minimal, standardized ontologies and formats must be established. Our latest project has attempted to define the minimal standards that are necessary to enable this integration. The technical challenges to integrate all these databases and the technologies to overcome these challenges will be described.

    Introducing glycomics data into the Semantic Web

    Background: Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans), which can, for example, serve as “switches” that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. Results: In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as “proofs-of-concept” to illustrate the utility of the Semantic Web for cross-database queries that were originally difficult to implement. Conclusions: We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains.
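
The idea of retrieving information by linking several data sets in one query can be sketched without a triple store. The following is a toy stand-in written in plain Python; it is not SPARQL and not the authors' queries, and every prefix, predicate and identifier in it (`dbA:`, `dbB:`, `hasGlycan`, `carriesGlycan`, `glycan:G1`) is made up for illustration.

```python
# Toy triples from two hypothetical sources that share glycan identifiers.
triples = {
    ("dbA:entry1", "hasGlycan", "glycan:G1"),
    ("dbA:entry2", "hasGlycan", "glycan:G2"),
    ("dbB:protein9", "carriesGlycan", "glycan:G1"),
    ("dbB:protein9", "label", "example protein"),
}


def match(pattern, store):
    """Yield variable bindings for one (s, p, o) pattern; '?x' marks a variable."""
    for triple in store:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A variable repeated within one pattern must bind consistently.
                if binding.get(want, got) != got:
                    break
                binding[want] = got
            elif want != got:
                break
        else:
            yield binding


def query(patterns, store):
    """Join several patterns on their shared variables."""
    results = [{}]
    for pattern in patterns:
        joined = []
        for partial in results:
            # Substitute already-bound variables before matching.
            bound = tuple(partial.get(term, term) for term in pattern)
            for b in match(bound, store):
                joined.append({**partial, **b})
        results = joined
    return results


# Which entries in source A share a glycan with a protein in source B?
rows = query(
    [("?entry", "hasGlycan", "?g"), ("?protein", "carriesGlycan", "?g")],
    triples,
)
```

The shared variable `?g` is what ties the two sources together, which is the role a common identifier scheme plays when separate databases are exported to a shared RDF representation.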

    WURCS: The Web3 Unique Representation of Carbohydrate Structures

    In recent years, the Semantic Web has become the focus of life science database development as a means to link life science data in an effective and efficient manner. In order for carbohydrate data to be applied to this new technology, there are two requirements for carbohydrate data representations: (1) a linear notation which can be used as a URI (Uniform Resource Identifier) if needed and (2) a unique notation such that any published glycan structure can be represented distinctively. This latter requirement includes the possible representation of nonstandard monosaccharide units as a part of the glycan structure, as well as compositions, repeating units, and ambiguous structures where linkages/linkage positions are unidentified. Therefore, we have developed the Web3 Unique Representation of Carbohydrate Structures (WURCS) as a new linear notation for representing carbohydrates for the Semantic Web.
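
Requirement (1), a linear notation usable as a URI, can be illustrated with a short sketch. It assumes a made-up base address and a placeholder notation string that merely contains the kinds of characters (slashes, brackets, commas) a linear notation may include; the string is not claimed to be valid WURCS, and the helper names are hypothetical.

```python
from urllib.parse import quote, unquote

# Hypothetical base address; not a real service endpoint.
BASE = "http://example.org/glycan/"


def notation_to_uri(notation: str) -> str:
    """Percent-encode reserved characters so the string fits in one path segment."""
    return BASE + quote(notation, safe="")


def uri_to_notation(uri: str) -> str:
    """Recover the original linear notation from a URI built above."""
    return unquote(uri[len(BASE):])


# Placeholder string, not a real glycan description.
example = "EXAMPLE=1/[abc-1_2]/1,2/"
uri = notation_to_uri(example)
assert uri_to_notation(uri) == example
```

Because the encoding is reversible, a single-line notation can serve both as an identifier in a URI and as the description of the structure, provided the notation is unique for each structure.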